Statistics of lines of natural images and implications for visual detection

As borders between different regions, lines are an important element of natural images. Already at the level of the mammalian primary visual cortex (V1), neurons respond best to lines of a given orientation. We reduce a set of images to linear segments and analyze their statistical properties. In particular, appropriately defined Fourier spectra show more power in their transverse component than in the longitudinal one. We then characterize filters that are best suited for extracting information from such images, and find some qualitative consistency with neural connections in V1. We also demonstrate that such filters are efficient in reconstructing missing lines in an image.



An image on a screen is represented by a set of intensities at each pixel. The photoreceptors of the retina also respond to the intensity of light arriving from specific directions. However, when it comes to interpreting the content of an image, primary clues are the borderlines between different regions. Indeed, already at the level of the mammalian primary visual cortex (V1), neurons respond best not to points of light, but to lines of particular orientation [1]. It is thus important to inquire about the statistics of lines in natural scenes, and implications for vision. In Ref. [2], such a study is performed by first converting images to a set of lines: Correlations of a pair of such lines with their relative location in space indicate a tendency towards co-circularity, namely the most likely arrangement of the two segments is to lie along a circular arc joining them. We start with a similar decomposition of images to lines, examine their statistics (e.g. by Fourier transformation), and explore their implications for visual detection.

There are previous studies of the power spectrum of the (scalar) intensity correlations of natural images [3, 4], which find indications of scale invariance. For a vectorial quantity, a natural decomposition is into longitudinal/transverse Fourier components, which measure the variations parallel/perpendicular to a wavevector $\vec k$. Such decomposition is for example quite common in studies of turbulent velocity fields [5, 6]. We construct similar measures of variations of the lines in natural images (which unlike a vector field do not point to a specific direction), and find enhanced power in the orthogonal (transverse) channel. We designate this feature, related to the prevalence of sharp lines, the 'transversality' of natural images.

Since the task of the visual cortex is to decipher visual signals, its design is likely to depend upon statistics of natural images. The visual input from the retina is carried by the optic nerve to the lateral geniculate nucleus (LGN), and then transferred to V1. A prominent feature of neurons in V1 is that they respond most strongly when viewing lines of a specific slant. This orientation preference (OP) is thought to arise from the arrangement of the feed-forward connections from the LGN [7]. However, within V1 there are also horizontal connections (extending for 2-5 mm) which mostly link columns of neurons with similar OPs [7, 8]. Staining experiments with injected biochemical tracers in the tree shrew reveal that these lateral connections are longer and stronger along an axis in the map of visual field that corresponds to the preferred orientation of the injection site [9]. Similarly, in the cat visual cortex, facilitatory effects occur only between neurons which are co-axial in the spatial domain and co-oriented in the orientation domain [10].

Although less understood than the feed-forward connections from LGN, the long range connections in V1 are presumed to mediate the global integration of an image from its local elements. Evidence supporting this comes from fMRI investigations in monkeys and humans: The neurons in V1 show higher response when viewing a long extended line, compared to randomly oriented segments of the line [11]. Here, we address the characteristics of the lateral connections from the perspective of information theory [12, 13, 14, 15, 16, 17]. Using the two point correlation functions for lines in natural images, we construct long-range filters that are optimally suited for harvesting visual information. We find that the strongest connections are between neurons with a common OP directed along the line joining the visual field locations of the neurons, as observed in the cortex of cat and tree shrew.

The long-range connections that maximize information reinforce the local (feed-forward) input to a neuron. If the local signal is for any reason corrupted, the global information can help to reconstruct it. Indeed, psychophysical tests show the facility of the brain to recognize missing segments in an image [18]. To mimic this effect, we construct filters that are optimally suited to study images composed of directed lines. Since most of the information is in the 'transverse' channel, these filters have a transverse character. We demonstrate that transverse filters perform much better than isotropic ones in filling in gaps in simple images.

To obtain statistics of lines, we start with a collection of black and white pictures from a database, http://hlab.phys.rug.nl/imlib/index.html [19], which includes trees, buildings, flowers, leaves, and grass. The data, which is in the form of a scalar intensity at each pixel, is then converted into oriented segments $[s_x(\vec X), s_y(\vec X)]$ at each pixel $\vec X$, using filters based on the second derivative of a Gaussian and its Hilbert transform [20]. Since $[s_x, s_y]$ and $[-s_x, -s_y]$ describe the same orientation, we introduce the tensor field $s_{\alpha\beta}(\vec X) = s_\alpha(\vec X)\, s_\beta(\vec X)$, which is invariant under reflection. The two dimensional Fourier transforms of the components of this tensor lead to a corresponding $\tilde s_{\alpha\beta}(\vec k)$. The longitudinal and transverse components of the tensor are then obtained as

$$S_\ell(\vec k) = \mathrm{tr}\,[L(\vec k)\,\tilde s(\vec k)], \qquad S_t(\vec k) = \mathrm{tr}\,[T(\vec k)\,\tilde s(\vec k)],$$ (1)

with the aid of the projection operators

$$L_{\alpha\beta}(\vec k) = \hat k_\alpha \hat k_\beta, \qquad T_{\alpha\beta}(\vec k) = [\delta_{\alpha\beta} - \hat k_\alpha \hat k_\beta],$$ (2)

where $\hat k$ is the unit vector in the direction of $\vec k$.

Figures 1(a) and (b) show the power spectra $S_{\ell\ell}(\vec k) = |S_\ell(\vec k)|^2$ and $S_{tt}(\vec k) = |S_t(\vec k)|^2$ after averaging over 100 images. Clearly these quantities are not isotropic and vary with angle. This is due to the predominance of vertical and horizontal segments in the images. The bias of oriented segments along cardinal directions in natural scenes is well known [21], and a similar bias exists in the OPs of cortical maps from adult ferret and cat [22, 23]. There is a correspondingly larger area of V1 devoted to vertical and horizontal orientations, and a greater stability of cardinal neurons to changes of orientation [24]. Since we are not interested in the predominance of specific orientations, we remove this anisotropy by averaging over rotated images [25]. Equivalently, we can average the power spectra in Fig. 1 over all angles, resulting in $S_{\ell\ell}$ and $S_{tt}$ as a function of $|\vec k|$, as depicted in Fig. 1(c).
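The angular averaging that turns a two-dimensional spectrum into a function of $|\vec k|$ can be sketched as follows. This is only an illustration and not the procedure used for the figures: the function name `angular_average`, the binning into rings of integer mode number, and the restriction to square arrays in FFT layout are all assumptions made here.

```python
import numpy as np

def angular_average(power):
    """Average a square 2D spectrum power[ky, kx] (FFT layout) over rings of |k|.

    Hypothetical helper: modes are grouped by the nearest-integer value of
    their radial mode number, and the mean over each ring is returned.
    """
    n = power.shape[0]
    idx = np.fft.fftfreq(n, d=1.0 / n)             # integer mode numbers 0, 1, ..., -1
    ky, kx = np.meshgrid(idx, idx, indexing="ij")  # axis 0 is y, axis 1 is x
    ring = np.rint(np.hypot(kx, ky)).astype(int)   # ring index of every mode
    total = np.bincount(ring.ravel(), weights=power.ravel())
    count = np.bincount(ring.ravel())
    k = np.arange(1, n // 2)                       # skip k = 0 and incomplete outer rings
    return k, total[k] / count[k]

# Usage: a flat spectrum stays flat after averaging.
k_values, flat_average = angular_average(np.ones((32, 32)))
```

Rings beyond mode number n/2 are dropped because the corners of the square Fourier grid cover them only partially.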

The data in Fig. 1 clearly show higher power in the transverse component. As with the intensities of natural images [3, 4], the power spectra are reasonably close to a power-law form $1/k^{2-\eta}$, presumably reflecting an underlying scale invariance since objects can appear at any distance from the viewer. (The straight line in Fig. 1(c) corresponds to $\eta = 0$.) We believe that the deviations from scale invariance (especially pronounced in the transverse component) are an artifact of our images. Converting intensity data to orientations involves filters with an inherent short distance scale; at such short scales the two power spectra coincide as required by local isotropy. There is a range of intermediate scales in which both spectra can be fitted to power laws. The deviations from scale invariance at shortest wavelengths are likely due to a tendency to frame photographs to include whole objects, excluding images with parts of objects extending beyond the frame [26].
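A fit of an angle-averaged spectrum to the power-law form $1/k^{2-\eta}$ over an intermediate range of scales could look like the sketch below. The helper `fit_eta`, the least-squares fit in log-log coordinates, and the synthetic data are assumptions for illustration; the source does not state how (or whether) an exponent was fitted.

```python
import numpy as np

def fit_eta(k, spectrum, k_min, k_max):
    """Fit spectrum ~ A / k**(2 - eta) for k_min <= k <= k_max; return (eta, A).

    Hypothetical helper: a straight-line fit of log(spectrum) against log(k).
    """
    k = np.asarray(k, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    window = (k >= k_min) & (k <= k_max)
    slope, intercept = np.polyfit(np.log(k[window]), np.log(spectrum[window]), 1)
    # log S = log A - (2 - eta) * log k, so the slope equals eta - 2.
    return 2.0 + slope, np.exp(intercept)

# Usage on synthetic data with a known exponent (eta = 0.3, A = 3).
k_demo = np.arange(1, 50)
eta_demo, amp_demo = fit_eta(k_demo, 3.0 / k_demo ** 1.7, k_min=2, k_max=40)
```

The window arguments matter in practice, since the text attributes the deviations at both ends of the range to artifacts rather than to the underlying statistics.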

FIG. 1: Intensity plots of the longitudinal $S_{\ell\ell}(\vec k)$ (a), and transverse $S_{tt}(\vec k)$ (b), power spectra obtained from averaging over a set of 100 natural images. (c) Log-log plots of $S_{\ell\ell}(k)$ and $S_{tt}(k)$ after averaging over all angles.

The enhanced transverse power is a consequence of the abundance of sharp and extended edges in natural images. An elementary illustration is obtained by comparing a long straight line with a horizontal arrangement of short vertical segments as in a fence. The former has no longitudinal Fourier component while the latter has weak transverse character. Searching for other sets of images with different statistics, we sampled paintings from modern art. We find that many paintings from the impressionist school with blurred lines have approximately equal transverse and longitudinal powers. By contrast, cubist paintings with sharp lines share (and in fact exceed) the transversality of natural images [27].
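The line-versus-fence illustration can be checked numerically with the projections of Eqs. (1) and (2). The sketch below is a toy construction and not the analysis pipeline of the paper. It assumes a square periodic grid and sets $\hat k = 0$ at $\vec k = 0$ (dropping that mode from both channels). The "fence" is idealized as a vertical segment at every pixel of one row, so that harmonics from the spacing of the pickets do not enter. All function names are hypothetical.

```python
import numpy as np

def unit_wavevectors(n):
    """Unit vectors k_hat on an n x n FFT grid; k_hat is set to 0 at k = 0."""
    f = np.fft.fftfreq(n)
    ky, kx = np.meshgrid(f, f, indexing="ij")   # axis 0 is y, axis 1 is x
    norm = np.hypot(kx, ky)
    norm[0, 0] = 1.0                            # avoid 0/0; kx = ky = 0 there anyway
    return kx / norm, ky / norm

def projected_power(sx, sy):
    """Return (|S_l|^2, |S_t|^2) for an orientation field (sx, sy) on a square grid."""
    # Reflection-invariant tensor s_ab = s_a * s_b, Fourier transformed per component.
    sxx = np.fft.fft2(sx * sx)
    sxy = np.fft.fft2(sx * sy)
    syy = np.fft.fft2(sy * sy)
    khx, khy = unit_wavevectors(sx.shape[0])
    # S_l = tr[L s] with L_ab = k_hat_a k_hat_b; S_t = tr[T s] = tr[s] - S_l.
    s_l = khx * khx * sxx + 2.0 * khx * khy * sxy + khy * khy * syy
    s_t = (sxx + syy) - s_l
    s_l[0, 0] = 0.0                             # the k = 0 mode has no direction
    s_t[0, 0] = 0.0
    return np.abs(s_l) ** 2, np.abs(s_t) ** 2

n = 64
line_sx, line_sy = np.zeros((n, n)), np.zeros((n, n))
line_sx[n // 2, :] = 1.0       # long horizontal line: segments point along x
fence_sx, fence_sy = np.zeros((n, n)), np.zeros((n, n))
fence_sy[n // 2, :] = 1.0      # idealized fence: vertical segments arranged along x

line_ll, line_tt = (p.sum() for p in projected_power(line_sx, line_sy))
fence_ll, fence_tt = (p.sum() for p in projected_power(fence_sx, fence_sy))
```

In this idealized setting the two configurations occupy the same pixels and differ only in the orientation of the segments, which is what moves the power from the transverse to the longitudinal channel.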

We next attempt to relate the above statistics to the lateral connections between neurons of V1, using information theoretic methods. The general idea is to construct an output signal by removing redundant correlations of the input signals as much as possible, maximizing the entropy of the output. Information theory has been used to describe early visual processing, such as the contrast response of large monopolar cells [12], the 'center surrounded' receptive fields in the retina [13, 14], and the white spatial/temporal power spectrum of signals from the LGN [15]. In Ref. [16], filters for processing intensity inputs to V1 were calculated by maximizing information subject to certain costs. Our approach is based on the latter, as extended in Ref. [17], but employing an input signal which is an orientation field.

The response of simple cells in V1 is primarily to an oriented line in a preferred direction, which we shall approximate by $\mathrm{tr}[t(\vec x)\, s(\vec X)] = [\vec t(\vec x)\cdot\vec s(\vec X)]^2$. Here $s_{\alpha\beta}(\vec X) = s_\alpha(\vec X)\, s_\beta(\vec X)$ is constructed from the orientation of the image segment (input signal) at position $\vec X$ in the visual field, while a tensor $t_{\alpha\beta}(\vec x) = t_\alpha(\vec x)\, t_\beta(\vec x)$ is defined in terms of the OP of a neuron at location $\vec x$ in V1. The topographic map between the visual field and V1 provides a mapping between $\vec x$ and $\vec X$. However, to emphasize that this mapping is not one to one, with many V1 neurons responding to signals at the same position in the visual field, we use two symbols $\vec X$ and $\vec x$.
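The identity behind this approximation, tr[t s] = (t · s)^2, and its invariance under reversing either vector, can be verified directly. The snippet assumes unit vectors parametrized by an angle; `op_response` is a hypothetical name.

```python
import numpy as np

def op_response(theta_neuron, theta_segment):
    """tr[t s] for unit vectors t, s at the given angles (radians)."""
    t = np.array([np.cos(theta_neuron), np.sin(theta_neuron)])
    s = np.array([np.cos(theta_segment), np.sin(theta_segment)])
    t_tensor = np.outer(t, t)          # t_ab = t_a t_b
    s_tensor = np.outer(s, s)          # s_ab = s_a s_b
    return np.trace(t_tensor @ s_tensor)

# Usage: the response depends only on the relative orientation.
aligned = op_response(0.3, 0.3)
flipped = op_response(0.3, 0.3 + np.pi)    # segment reversed: same orientation
orthogonal = op_response(0.0, np.pi / 2)
```

For unit vectors the response equals the squared cosine of the angle between the preferred orientation and the segment, so a reversed segment gives the same response as the original.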



Our main interest is in the lateral connections to a cell from other neurons in V1. With this aim, we denote the net response (neuron firing rate) by

$$O(\vec x) = \mathrm{tr}[t(\vec x)\, s(\vec X)] + \int d^2y\, F(\vec x,\vec y)\, \mathrm{tr}[t(\vec y)\, s(\vec Y)] + \eta(\vec x).$$ (3)

The 'filter function' $F(\vec x,\vec y)$ denotes the strength of the horizontal connection between the neurons at $\vec x$ and $\vec y$; $\eta(\vec x)$ is the noise experienced by the neuron, which is assumed to be uncorrelated at different points, with $\langle \eta(\vec x)\,\eta(\vec x')\rangle = \sigma^2 \delta^2(\vec x - \vec x')$. Given the stochastic nature of the input signal (as well as the noise), the output $O(\vec x)$ is a random variable with a (joint) probability distribution $p[O(\vec x)]$. The associated Shannon information is

$$I = -\langle \ln p[O(\vec x)]\rangle \approx \frac{1}{2} \ln\det[\langle O(\vec x)\, O(\vec x')\rangle_c],$$ (4)

where $\langle O(\vec x)\, O(\vec x')\rangle_c$ is the second cumulant (covariance) of the output. The final approximation assumes a Gaussian $p[O(\vec x)]$, and ignores higher order cumulants. For low signal to noise ratio we can further simplify the result to

$$I \approx \frac{1}{2}\int d^2x\, S_{\alpha\beta\gamma\delta}(0)\, t_{\alpha\beta}(\vec x)\, t_{\gamma\delta}(\vec x) + \int d^2x \int d^2y\, F(\vec x,\vec y)\, S_{\alpha\beta\gamma\delta}(\vec X - \vec Y)\, t_{\alpha\beta}(\vec x)\, t_{\gamma\delta}(\vec y),$$ (5)

where $S_{\alpha\beta\gamma\delta}(\vec X - \vec Y) = \langle s_{\alpha\beta}(\vec X)\, s_{\gamma\delta}(\vec Y)\rangle_c/\sigma^2$ denotes the covariance of the input signal.
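The low signal-to-noise simplification rests on the expansion ln det(1 + A) ≈ tr A for a small matrix A, here the signal covariance measured in units of the noise variance. The sketch below checks this numerically on a random positive matrix. The matrix size, the value of `eps`, and the helper name are arbitrary choices for illustration, and an additive constant from the noise variance is dropped.

```python
import numpy as np

def gaussian_information(cov):
    """0.5 * ln det(cov) for a symmetric positive-definite covariance matrix."""
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * logdet

rng = np.random.default_rng(0)
a = rng.normal(size=(6, 6))
signal_cov = a @ a.T      # hypothetical signal covariance, in units of the noise variance
eps = 1e-3                # small signal-to-noise ratio

exact = gaussian_information(np.eye(6) + eps * signal_cov)
linear = 0.5 * eps * np.trace(signal_cov)   # leading term kept at low signal to noise
```

The difference between `exact` and `linear` is of second order in `eps`, which is why only terms linear in the signal covariance are kept.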

To provide a meaningful comparison of different filters, we need to maximize the above information subject to costs and constraints. In particular, it is reasonable to assume that an expansion of the wiring costs for small $F$ starts at quadratic order (so that no connections are formed in the absence of any gain). Following Refs. [16, 17], we introduce a cost function

$$C[t,F] = C_1[t] + \frac{1}{2}\int d^2x\, d^2y\, C_2(\vec x - \vec y)\, F(\vec x,\vec y)^2,$$ (6)

where $C_2(r)$ is a cost for connecting neurons at a separation $r$. We would like to maximize $I - C$ with respect to both $t(\vec x)$ and $F(\vec x,\vec y)$. The largest contribution should come from the local OPs encoded in $t(\vec x)$. However, this is not our concern here, and for this reason we have not dwelled on the precise form of $C_1[t]$. Given that the field $t_{\alpha\beta}(\vec x)$ has somehow been established, we would like to determine $F(\vec x,\vec y)$. Assuming that the latter connections provide a small correction to the overall information, maximization gives

$$F(\vec x,\vec y) = \frac{t_{\alpha\beta}(\vec x)\, S_{\alpha\beta\gamma\delta}(\vec X - \vec Y)\, t_{\gamma\delta}(\vec y)}{C_2(\vec x - \vec y)}.$$ (7)

The optimal connection between two V1 neurons thus depends on their OPs, and the correlations in natural signals at the corresponding locations and orientations.
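The maximization step leading to Eq. (7) can be spelled out as below. The forms of the $F$-dependent parts of $I$ and $C$ written in the comments are assumptions, reconstructed from Eqs. (5) and (6): a unit prefactor for the term of $I$ linear in $F$, and a factor 1/2 in the quadratic cost. $F(\vec x,\vec y)$ is treated as an independent variable for each pair of locations.

```latex
% Assumed F-dependent parts (summation over repeated indices implied):
%   I[F] = \int d^2x \int d^2y \, F(\vec x,\vec y)\,
%          S_{\alpha\beta\gamma\delta}(\vec X-\vec Y)\,
%          t_{\alpha\beta}(\vec x)\, t_{\gamma\delta}(\vec y) + \dots
%   C[F] = \frac{1}{2}\int d^2x\, d^2y \, C_2(\vec x-\vec y)\, F(\vec x,\vec y)^2 + \dots
\begin{align}
\frac{\delta (I - C)}{\delta F(\vec x,\vec y)}
  &= t_{\alpha\beta}(\vec x)\, S_{\alpha\beta\gamma\delta}(\vec X-\vec Y)\,
     t_{\gamma\delta}(\vec y) - C_2(\vec x-\vec y)\, F(\vec x,\vec y) = 0 , \\
\Longrightarrow \quad F(\vec x,\vec y)
  &= \frac{t_{\alpha\beta}(\vec x)\, S_{\alpha\beta\gamma\delta}(\vec X-\vec Y)\,
     t_{\gamma\delta}(\vec y)}{C_2(\vec x-\vec y)} .
\end{align}
```

The second variation is $-C_2(\vec x - \vec y)$, so the stationary point is a maximum wherever the connection cost is positive.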




FIG. 2: The strength of horizontal connections among neurons with parallel OPs (solid line, $F_\parallel$), and with orthogonal OPs (dotted line, $F_\perp$), as a function of their angle $\varphi$ to the line between their locations in the visual field. The results are for a fixed separation, and obtained from the statistics of lines in a set of five images of trees.



This qualitatively agrees with the observations in tree shrew [9] and cat [10]. To confirm that Eq. (7) does indeed predict the enhanced horizontal connections between colinear and co-oriented neurons, we measured the two point correlation functions by averaging over a set of five images of trees. Figure 2 compares the strength of the connection among neurons with parallel OPs ($F_\parallel$) to that of neurons with orthogonal OPs ($F_\perp$), as a function of the angle $\varphi$ between one of the OPs and the line joining their locations in the visual field. The figure is for a constant separation $|\vec x - \vec y|$; the angular dependence is not very sensitive to this separation. There is a strong maximum in $F_\parallel$ at colinearity ($\varphi = 0$), while $F_\perp$ (which is always smaller than $F_\parallel$) shows weak maxima at $\pi/4$ and $3\pi/4$ (consistent with the co-circularity principle [2]).

One advantage of optimal filters is observed by noting that the resulting noise-averaged output of a neuron is

$$\overline{O(\vec x)} = t_{\alpha\beta}(\vec x)\left[ s_{\alpha\beta}(\vec X) + \int d^2y\, \frac{\langle s_{\alpha\beta}(\vec X)\, s_{\gamma\delta}(\vec Y)\rangle_c}{C_2(\vec x - \vec y)}\, s_{\gamma\delta}(\vec Y) \right].$$

If the primary signal $s_\alpha(\vec X)$ is somehow corrupted, the connections provide a guess based on global statistics. Let us employ similar principles to construct artificial algorithms for visual detection, which (like the human brain) are adept at deducing global shapes in an image composed of lines. As an alternative to Eq. (3) which avoids introduction of an OP field, we define a vectorial output whose components are

$$O_\alpha(\vec x) = \int d^2y\, F_{\alpha\beta}(\vec x - \vec y)\, s_\beta(\vec y) + \eta_\alpha(\vec x).$$ (8)



The filter is now a $2 \times 2$ matrix. As in Eq. (1), its Fourier transform can be projected into longitudinal/transverse parts as

$$\tilde F_{\alpha\beta}(\vec k) = L_{\alpha\beta}(\vec k)\, F_\ell(\vec k) + T_{\alpha\beta}(\vec k)\, F_t(\vec k).$$ (9)



Now consider a set of images in the form of a vector field $\vec s(\vec x)$, which is statistically invariant under translations. For low signal to noise, the Shannon information in the output is

$$I \propto \int \frac{d^2k}{(2\pi)^2} \left[ |F_\ell(\vec k)|^2 S_{\ell\ell}(\vec k) + |F_t(\vec k)|^2 S_{tt}(\vec k) \right],$$ (10)

with projected signal correlations defined as in Eq. (1). As before, we can search for filters that maximize information subject to specified costs. However, to simplify matters we note that the transverse and longitudinal channels can be treated independently, and that most of the information is in the transverse channel which has the larger signal power spectrum. As such, we compared the performance of the following filters: (1) a transverse filter with $F_t(\vec k) = \phi(k)$ and $F_\ell(\vec k) = 0$; and (2) an isotropic filter with $F_t(\vec k) = F_\ell(\vec k) = \phi(k)/\sqrt{2}$. In both cases, we chose $\phi(k) \propto \exp(-k^2/4\zeta)$. The input image, illustrated in Fig. 3(a), consists of vectors, some pointing randomly (noise), and some arranged into a line with a gap (corrupted image). Figures 3(b)-(c) indicate how well the filters reconstruct the missing part. The output of the transverse filter is both stronger and better oriented to the erased line. (Detailed results quantifying the improvements shall be reported elsewhere.)
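A minimal sketch of this comparison, applying the filter of Eq. (9) in Fourier space to a toy image, is given below. It is not the authors' implementation. The grid size, the gap length, the value `zeta = 0.05`, the noise model (a random-direction vector of fixed magnitude at every pixel off the line) and the names `apply_filter` and `make_test_image` are all assumptions made for illustration. The noise term of Eq. (8) is omitted and $\hat k$ is set to zero at $\vec k = 0$.

```python
import numpy as np

def apply_filter(sx, sy, zeta, transverse=True):
    """Filter a vector field (sx, sy) with F_ab = L_ab F_l + T_ab F_t on a square grid."""
    n = sx.shape[0]
    k1 = 2.0 * np.pi * np.fft.fftfreq(n)            # wavenumbers in radians per pixel
    ky, kx = np.meshgrid(k1, k1, indexing="ij")     # axis 0 is y, axis 1 is x
    k2 = kx ** 2 + ky ** 2
    phi = np.exp(-k2 / (4.0 * zeta))                # phi(k) ~ exp(-k^2 / 4 zeta)
    norm = np.sqrt(k2)
    norm[0, 0] = 1.0                                # makes k_hat = 0 at k = 0
    khx, khy = kx / norm, ky / norm
    if transverse:
        f_t, f_l = phi, np.zeros_like(phi)          # filter (1): F_t = phi, F_l = 0
    else:
        f_t = f_l = phi / np.sqrt(2.0)              # filter (2): F_t = F_l = phi / sqrt(2)
    fx, fy = np.fft.fft2(sx), np.fft.fft2(sy)
    s_l = khx * fx + khy * fy                       # longitudinal amplitude k_hat . s
    # O = F_l * k_hat (k_hat . s) + F_t * (s - k_hat (k_hat . s))
    ox = f_l * khx * s_l + f_t * (fx - khx * s_l)
    oy = f_l * khy * s_l + f_t * (fy - khy * s_l)
    return np.fft.ifft2(ox).real, np.fft.ifft2(oy).real

def make_test_image(n=64, gap=8, noise=0.3, seed=0):
    """Directed horizontal line with an erased segment, plus randomly pointing vectors."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(n, n))
    sx, sy = noise * np.cos(angles), noise * np.sin(angles)
    row, start = n // 2, (n - gap) // 2
    sx[row, :], sy[row, :] = 1.0, 0.0                       # line directed along x
    sx[row, start:start + gap], sy[row, start:start + gap] = 0.0, 0.0  # erase the gap
    return sx, sy, (row, start + gap // 2)

sx, sy, centre = make_test_image()
out_transverse = apply_filter(sx, sy, zeta=0.05, transverse=True)
out_isotropic = apply_filter(sx, sy, zeta=0.05, transverse=False)
```

The x components `out_transverse[0][centre]` and `out_isotropic[0][centre]` give the reconstructed signal at the middle of the erased segment for the two filters. The y components there indicate how well the output is aligned with the line.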












FIG. 3: (a) A test image of a directed line with a gap (plus noise). Reconstructions of the missing segment with an isotropic filter (b), and with a transverse filter (c).

[25] We confirmed that the spectra become more isotropic as we average over more rotated images. Note that with a matrix $s_{\alpha\beta}$ obtained from an orientation field, there is no a priori reason for the cross correlations $S_{\ell t}(\vec k)$ and $S_{t\ell}(\vec k)$ to be zero. We do find that these correlations are small, and also decrease as we average over rotated images.

[26] This was tested by generating random lines within one frame. As the length of lines increases, the meeting point between the two spectra is shifted to smaller $k$.

[27] Additional pictures and data are available online from http://www.mit.edu/~kardar/research/transversality/ModernArt/



